Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Reserve maxSets OR descriptorPoolSizes if either are required. Ensure… #961

Closed
wants to merge 18 commits into from

Conversation

rolandhill
Copy link
Contributor

MaxSets and DescriptorSetPoolSizes would only be reserved if BOTH were required. This PR reserves additional resources if EITHER are required and ensures that at least 1 maxSet is reserved.

Warning in Context::reserve() is no longer issued when either required_maxSets or required_descriptorPoolSizes is 0

Tested on Kubuntu in my own application that used to throw warnings.

@@ -161,6 +161,7 @@ void Context::getDescriptorPoolSizesToUse(uint32_t& maxSets, DescriptorPoolSizes
{
maxSets = minimum_maxSets;
}
maxSets = std::max(1u, maxSets);
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is essentially equivalent to setting minimum_maxSets to 1.

@@ -244,7 +245,7 @@ void Context::reserve(const ResourceRequirements& requirements)
if (required_maxSets > 0 || !required_descriptorPoolSizes.empty())
{
getDescriptorPoolSizesToUse(required_maxSets, required_descriptorPoolSizes);
if (required_maxSets > 0 && !required_descriptorPoolSizes.empty())
if (required_maxSets > 0 || !required_descriptorPoolSizes.empty())
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The recent changes were made because we were seeing Vulkan debug errors when maxSets was 0 or the pool sizes vector was empty so checking both using an && was required.

@rolandhill
Copy link
Contributor Author

Hi Robert,

As it stands, context::reserve() will work in some cases, but throw a warning in the case where maxSets == 0 OR DescriptorPoolSizes == 0, but not both. While there are work-arounds, this is a situation that might only appear in production code when the user deletes some objects in the scene and creates others with different descriptor requirements. If you have an application that only releases and compiles objects of the same type then it won't show up because you are releasing and compiling the same number of descriptors each time (eg image tiles). When there are many different object types with different descriptor requirements and all the content is dynamically loaded and compiled, then the issue shows up a lot.

The changes in this PR make context::reserve() work all the time. The change you mention above to check if maxSets > 0 is only a work around to prevent the warning, but this it what creates the problem of context::reserve() not working in all situations, plus it then means that DescriptorPoolSizes aren't being reserved if sufficient maxSets (ie required_maxSets == 0) are available. This is all fixed by maxSets = std::max(1u, maxSets) in getDescriptorPoolSizesToUse().

Yes, you can work around this requirement as well by ensuring that every compilation request is accompanied by a minimum_maxSets hint, but you would need to do it for every object being compiled and it is only to mitigate that context::reserve() isn't working properly.

With these changes, all the dynamic content in your sceneGraph can be compiled with

auto result = _viewer->compileManager->compile(vis->_stagingRoot);

before merging without any need to involve ResourceHints at all because compileManager->compile will run CollectResourceRequirements anyway and it provides all the information required.

Everything in my scene graph is generated dynamically after the initial viewer->compile(), including tiled DEM, various octree leaf nodes and on-request text labels and it all works without resource requirement hints using the simple compilation line above, once the changes in this PR are applied, so it is fairly well tested. I can't see that there would be any regression for existing code.

The discussion comment here #960 (comment) showed some examples of when context::reserve fails. These are just a few calls that I pulled out to demonstrate the combinations and results.

Regards,
Roland

@siystar
Copy link

siystar commented Sep 16, 2023

While there are work-arounds, this is a situation that might only appear in production code when the user deletes some objects in the scene and creates others with different descriptor requirements. If you have an application that only releases and compiles objects of the same type then it won't show up because you are releasing and compiling the same number of descriptors each time (eg image tiles). When there are many different object types with different descriptor requirements and all the content is dynamically loaded and compiled, then the issue shows up a lot.

I am seeing the same and find it's caused by the DescriptorPool's _recyclingList inflating the availability statistics with sets that aren't actually available to use unless a compatible descriptor set layout is used. Removing the recycling of descriptor sets, i.e. immediately calling vkFreeDescriptorSets when a vsg::DescriptorSet object is destroyed, should help. Recycling VkDescriptorSet objects on the VSG side may be completely unnecessary. In my understanding, the VkDescriptorPool can implement something similar (or even better) under the hood, with comparable efficiency as it's an externally synchronized object, and the freedom to distribute the available pool sizes among descriptor sets as needed.

@siystar
Copy link

siystar commented Sep 16, 2023

Note, the existence of multiple vsg::DescriptorPool objects, with each pool having some resources to spare, but not enough to fit the required descriptor set, can also inflate the statistics because the available sets and pool sizes from each descriptorPool are merged when checking for availability. Such a configuration of DescriptorPools is readily produced because of the issues with the _recyclingList described above.

@robertosfield
Copy link
Collaborator

I need to complete other complex work before I start thinking other complex topics so while I haven't yet dived into the topic I will, but won't be able to for a week or so. When I get the opportunity I'll dive into the topic fully and refactor what needs to be improved. At that point I'll need test cases to help scope out the current issues and to confirm they have been resolved. However, right now I have to set all this work and thinking about this topic aside. Thanks for your patience.

@siystar
Copy link

siystar commented Sep 16, 2023

If we decide to remove the DescriptorPool:: _recyclingList then we'll need to add handling of VK_ERROR_FRAMENTATION/VK_ERROR_OUT_OF_POOL_MEMORY, as calls to vkAllocateDescriptorSets may fail for that reason unless vkFreeDescriptorSets has never been called on the pool. It follows that, whatever VkDescriptorPool is doing under the hood can be prone to fragmentation as well, though I doubt it's as severely an issue.

On the other hand, a different solution that allows us to retain the recycling of descriptor sets could be to divide the descriptor pools into sets of pools according to the DescriptorSetLayout that is required. When allocating a new DescriptorPool, it's allocated for a specific DescriptorSetLayout and will only ever be used with that layout, or an identically defined layout, which guarantees that descriptor sets in the _recyclingList are usable and avoids the fragmentation issues provoked by the use of different descriptor set layouts. From the application developer's point of view, it's then no longer required to specify a minimum_descriptorPoolSizes, they can be automatically calculated from the DescriptorSetLayout and minimum_maxSets.

For applications using common descriptor set layouts for most of their pipelines, such a partioning of DescriptorPools should be very efficient. In my application, I decided to use the same layout for all shaders in a shader set. While many, or even most of the descriptor bindings will be unused by some shaders, using static layouts somewhat simplifies the code and reduces the number of Vulkan objects that need to be created. The Vulkan spec for vkCreatePipelineLayout even states that sharing layouts may improve performance:

"The pipeline layout can include entries that are not used by a particular pipeline. The pipeline layout allows the application to provide a consistent set of bindings across multiple pipeline compiles, which enables [...] that the implementation may cheaply switch pipelines without reprogramming the bindings."

theodoregoetz added a commit to theodoregoetz/VulkanSceneGraph that referenced this pull request May 29, 2024
Following the discussions in vsg-dev#960 and vsg-dev#961 this sets the minimum maxSets to 1 so that when switching out graphics pipelines that use different descriptor sets, the warning about Context::reserve() is avoided
@robertosfield
Copy link
Collaborator

@rolandhill I've finally cleared some time to investigate this issue properly. This particular PR has so many commits associated with it I can't spot the important one. Could you point me at the commit or apply the changes to VSG and generate a PR?

There is also #1203 that addresses the same problem, could you have a look at?

I have recreated the problem using vsgviewer models/openstreetmap.vsgt by just zooming and then throwing the camera so earth rotates beneath and eventually

"Warning: Context::reserve(const ResourceRequirements& requirements) invalid combination of required_maxSets (0) & required_descriptorPoolSizes (2) unable to allocate DescriptorPool."

Starts popping up, I've recorded a camera path for this so can reliably recreate it. I'll be using this as a test for reproducing the problem and then testing potential fixes. Happy to add other tests as well.

@rolandhill
Copy link
Contributor Author

rolandhill commented Jul 9, 2024 via email

@rolandhill
Copy link
Contributor Author

I've synced everything now. There are a lot of commits listed, mainly rebasing I think, you only need to look at the first one

4d802e4

or check the Files Changed tab and it shows just the two lines that have actually changed.

Cheers,
Roland

@rolandhill
Copy link
Contributor Author

If you're testing by spinning a globe then you are just paging in and out tiles with the same descriptor requirements and therefore not testing all permutations of required_maxSets and required_descriptorPoolSizes requirements. This is why the && needs to be changed to ||, as it is in the PR.

-       if (required_maxSets > 0 && !required_descriptorPoolSizes.empty())
+       if (required_maxSets > 0 || !required_descriptorPoolSizes.empty())

@robertosfield
Copy link
Collaborator

Thanks @rolandhill, I'll test your commit. 4d802e4

I've put debug info into DescriptorSet/DescriptorPool/Context and am studying what is happening prior to Context reporting issues as I'd like to properly understanding what is happening. One thing of note is when database paging I'm getting lots of small DescriptorPool allocated, something that is inefficient so I'd like to improve this area at the same time as tackling the error condition.

I am considering creating a DescriptorPools class that has a list of DescriptorPool, rather than having Context do this. this DescriptorPools container class would then be available via vsg::Device as well as from Context. This would bring it inline with the DeviceMemoryPools class that now has this capabilities.

I'm still learning/experimenting but should get this all resolved this week.

@robertosfield
Copy link
Collaborator

I have created a DescriptorPool branch that has a new vsg::DescriptorPools class that has a list of DescriptorPool.

https://github.com/vsg-dev/VulkanSceneGraph/tree/DescriptorPool

This immediately fixes an issue with the previous code where it wasn't sharing the first DescriptorPool created by the application, and the creation of new DescriptorPool with the minimum sizes set during initialization.

I am still reviewing the issues raised in this PR, the new branch has the proposed fixes in this PR using a #define ROLAND_HILL_FIX to toggle between the old the and proposed code paths.

@robertosfield
Copy link
Collaborator

I'm now quite advanced with a refactor of how vsg::DescriptorPool are managed with the introduction of new vsg::DescriptorPools class. The implementation should already address the issues tackled by this PR so should make this PR redudent. Could you please test the DescriptorPool branch of the VSG to check that things are working correctly for you application? Thanks.

@robertosfield
Copy link
Collaborator

I have now completed my work on vsg::DescriptorPools and have merged it with VSG master with the PR #1239. I will now close this PR as it should no longer be required.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants